9E1. Which of the following is a requirement of the simple
Metropolis algorithm?
(1) The parameters must be discrete.
The parameters of the Metropolis algorithm can be discrete, or they can
take on a continuous range of values.
(2) The likelihood function must be Gaussian.
One of the main advantages of MCMC is that we do not have to know the
shape of the posterior distribution, so the likelihood does not need to be Gaussian.
(3) The proposal distribution must be symmetric.
This is a requirement for the simple Metropolis algorithm.
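To make the symmetry requirement concrete, here is a minimal sketch (my own toy example, not from the chapter) of the simple Metropolis algorithm targeting a standard normal, using a symmetric normal proposal:

```r
# A minimal Metropolis sampler targeting a standard normal density.
set.seed(1)
n_samples <- 5000
chain <- numeric(n_samples)
current <- 0
for (i in 1:n_samples) {
  # symmetric proposal: proposing b from a is as likely as proposing a from b
  proposal <- rnorm(1, mean = current, sd = 1)
  # with a symmetric proposal, the acceptance ratio needs only the target density
  log_r <- dnorm(proposal, log = TRUE) - dnorm(current, log = TRUE)
  if (log(runif(1)) < log_r) current <- proposal
  chain[i] <- current
}
mean(chain)  # should be near 0
sd(chain)    # should be near 1
```

If the proposal were not symmetric, the acceptance ratio would also need a correction term for the proposal densities (that generalization is the Metropolis-Hastings algorithm).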
9E2. Gibbs sampling is more efficient than the Metropolis
algorithm. How does it achieve this extra efficiency? Are there any
limitations to the Gibbs sampling strategy?
Gibbs sampling is more efficient than the Metropolis algorithm because
it can get a good estimate of the posterior with fewer samples than the
simple Metropolis algorithm needs. How does it do this? It uses adaptive
proposals, where the distribution of proposed parameter values adjusts
depending upon the parameter values at the moment, by exploiting
particular combinations of prior distributions and likelihoods
(conjugate pairs) whose conditional distributions have known analytical
forms. There are two main limitations to the Gibbs sampling strategy:
first, it requires using those conjugate priors, which we may not always
want; second, as models grow to hundreds or thousands of correlated
parameters, Gibbs sampling becomes very inefficient, because its
proposals tend to get stuck in small regions of the posterior.
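As an illustration (my own toy example, not from the chapter), Gibbs sampling a bivariate normal shows the adaptive-proposal idea: each parameter is drawn from its full conditional given the current value of the other, and every draw is accepted:

```r
# Toy Gibbs sampler for a standard bivariate normal with correlation rho.
# Each full conditional is itself a normal distribution, so every draw is
# an exact sample from the conditional -- nothing is rejected.
set.seed(2)
rho <- 0.6
n <- 5000
x <- numeric(n)
y <- numeric(n)
x_cur <- 0
y_cur <- 0
for (i in 1:n) {
  # conditional of x given y is N(rho * y, 1 - rho^2), and vice versa
  x_cur <- rnorm(1, mean = rho * y_cur, sd = sqrt(1 - rho^2))
  y_cur <- rnorm(1, mean = rho * x_cur, sd = sqrt(1 - rho^2))
  x[i] <- x_cur
  y[i] <- y_cur
}
cor(x, y)  # should be near rho = 0.6
```

The analytic conditionals are exactly what Gibbs exploits, and also exactly what limits it: without conjugate structure, those conditionals have no closed form.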
9E3. Which sort of parameters can Hamiltonian Monte Carlo not
handle? Can you explain why?
Hamiltonian Monte Carlo can only handle continuous parameters, meaning it
cannot handle discrete parameters. The Hamiltonian Monte Carlo method
runs a physics simulation in which a frictionless “vehicle” can stop at
any point and glide across the surface of the log-posterior, instead of
“jumping” from one discrete location to the next. That gliding requires
a continuous surface, which is what makes this method different and why
discrete parameters cannot be handled.
9E4. Explain the difference between the effective number of
samples, n_eff as calculated by Stan, and the actual number
of samples.
The n_eff is the effective number of samples, meaning it
answers the question: “how long would the chain be, if each sample were
independent of the one before it?” This question hints at what is
different about it versus the actual number of samples. Markov chains
are usually autocorrelated, meaning successive samples are not
independent; the raw number of samples ignores this dependence, while
the effective number of samples accounts for it. As a result, the
effective number of samples is nearly always smaller than the actual
number of samples taken.
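A toy AR(1) chain (my own sketch, not Stan's actual estimator) makes the gap between raw and effective samples concrete; for lag-one autocorrelation rho, n_eff is roughly n(1 - rho)/(1 + rho):

```r
# An AR(1) chain with lag-one autocorrelation rho; its stationary variance
# is 1, but successive samples are highly dependent.
set.seed(3)
n <- 1e4
rho <- 0.9
chain <- numeric(n)
for (i in 2:n) chain[i] <- rho * chain[i - 1] + rnorm(1, sd = sqrt(1 - rho^2))
# for an AR(1) process, n_eff is approximately n * (1 - rho) / (1 + rho)
n_eff_approx <- n * (1 - rho) / (1 + rho)
round(n_eff_approx)  # 526 -- far fewer than the 10,000 raw samples
```

Stan's estimator sums autocorrelations over many lags rather than using this closed form, but the intuition is the same: the stronger the autocorrelation, the fewer effectively independent samples.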
9E5. Which value should Rhat approach, when a
chain is sampling the posterior distribution correctly?
\(\hat{R}\) should approach 1 from
above when the chain is sampling the posterior distribution correctly.
When \(\hat{R}\) stays noticeably above 1, it
usually indicates that the chains have not yet converged, and you shouldn’t
trust the samples produced.
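As a rough illustration (a simplified version of the Gelman-Rubin statistic, not Stan's split-chain \(\hat{R}\)), \(\hat{R}\) compares between-chain to within-chain variance:

```r
# A crude version of the Gelman-Rubin R-hat (without Stan's split-chain
# refinement): compare between-chain variance to within-chain variance.
rhat <- function(chains) {              # one column per chain
  n <- nrow(chains)
  W <- mean(apply(chains, 2, var))      # average within-chain variance
  B <- n * var(colMeans(chains))        # between-chain variance
  sqrt(((n - 1) / n * W + B / n) / W)
}
set.seed(4)
good <- cbind(rnorm(1000), rnorm(1000))        # chains exploring the same target
bad  <- cbind(rnorm(1000, 0), rnorm(1000, 3))  # chains stuck in different places
rhat(good)  # close to 1
rhat(bad)   # well above 1
```

When the chains agree, the between-chain variance adds almost nothing and the ratio is near 1; when chains explore different regions, the numerator inflates and \(\hat{R}\) rises above 1.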
9E6. Sketch a good trace plot for a Markov chain, one that is
effectively sampling from the posterior distribution. What is good about
its shape? Then sketch a trace plot for a malfunctioning Markov chain.
What about its shape indicates malfunction?
# first let's define what the model will try to estimate
library(rethinking)
## Loading required package: rstan
## Loading required package: StanHeaders
##
## rstan version 2.26.13 (Stan version 2.26.1)
## For execution on a local, multicore CPU with excess RAM we recommend calling
## options(mc.cores = parallel::detectCores()).
## To avoid recompilation of unchanged Stan programs, we recommend calling
## rstan_options(auto_write = TRUE)
## For within-chain threading using `reduce_sum()` or `map_rect()` Stan functions,
## change `threads_per_chain` option:
## rstan_options(threads_per_chain = 1)
## Loading required package: parallel
## Loading required package: dagitty
## rethinking (Version 2.01)
##
## Attaching package: 'rethinking'
## The following object is masked from 'package:stats':
##
## rstudent
y <- c(-2, 2)
# now we can create our models,
# I will create one with extremely flat priors first
# to show a malfunctioning Markov chain
set.seed(1101)
p <- ulam(
alist(
y ~ dnorm(mu, sigma),
mu <- a,
a ~ dnorm(0, 1000),
sigma ~ dexp(0.0001)
),
data = list(y = y), chains = 3, cores = 3
)
## Warning: There were 143 divergent transitions after warmup. See
## https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## to find out why this is a problem and how to eliminate them.
## Warning: Examine the pairs() plot to diagnose sampling problems
## Warning: The largest R-hat is 1.06, indicating chains have not mixed.
## Running the chains for more iterations may help. See
## https://mc-stan.org/misc/warnings.html#r-hat
## Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
## Running the chains for more iterations may help. See
## https://mc-stan.org/misc/warnings.html#bulk-ess
## Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
## Running the chains for more iterations may help. See
## https://mc-stan.org/misc/warnings.html#tail-ess
precis(p)
## mean sd 5.5% 94.5% n_eff Rhat4
## a 53.02643 278.7496 -207.366598 662.8925 67.60171 1.051120
## sigma 483.15127 1444.8990 5.392308 1919.9878 205.05705 1.013936
# this is not good: we centered a on 0 and its estimate is nowhere near that
# also the sigma estimate looks insane
# let's make a trace plot
traceplot(p)
With the extremely flat priors, you can see that the chains are
sampling extreme and implausible values (alpha is sampled all the way
into the 1500s). These chains are not healthy. A healthy chain has a
much “tighter”, stationary distribution, indicating convergence.
Let’s look at healthy chains.
# again let's define what the model will try to estimate
y <- c(-2, 2)
# now we can create our models,
# I will create one with weakly regularizing priors
# alpha's prior will still be wide, but not extremely flat, and still centered on 0
set.seed(1101)
p2 <- ulam(
alist(
y ~ dnorm(mu, sigma),
mu <- a,
a ~ dnorm(0, 10),
sigma ~ dexp(1)
),
data = list(y = y), chains = 3, cores = 3
)
precis(p2)
## mean sd 5.5% 94.5% n_eff Rhat4
## a -0.05984984 1.6196603 -2.575554 2.469738 678.2408 1.002644
## sigma 2.29352820 0.9716406 1.181763 4.055242 492.8094 1.002404
# this looks much better: the sigma is way smaller
# and the alpha is somewhere near 0
# let's make a traceplot
traceplot(p2)
The first thing I notice is that the warning message about divergent
transitions is gone, which is a good sign. Next, looking at the values
from precis(), we see that sigma is much smaller and
alpha falls within the expected range, which is also great. Lastly,
looking at the trace plots, we can see that they look much healthier:
the sampling no longer wanders out into the thousands, and the chains
are stable and stationary around the same values.
9E7. Repeat the problem above, but now for a trace rank
plot.
Now let’s just insert p and p2 into trankplot and
observe. (Well, in a different R-universe I would, but I did something
else that gives me the trace rank plot. Thank you again, Nico :) )
library(bayesplot)
## This is bayesplot version 1.9.0
## - Online documentation and vignettes at mc-stan.org/bayesplot
## - bayesplot theme set to bayesplot::theme_default()
## * Does _not_ affect other ggplot2 plots
## * See ?bayesplot_theme_set for details on theme setting
library(aptheme)
# round one
# get the stan code
model_stan <- stancode(p)
## data{
## vector[2] y;
## }
## parameters{
## real a;
## real<lower=0> sigma;
## }
## model{
## real mu;
## sigma ~ exponential( 1e-04 );
## a ~ normal( 0 , 1000 );
## mu = a;
## y ~ normal( mu , sigma );
## }
# fit the model in stan
model_fit <- stan(model_code = model_stan, data = list(y = y))
##
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 1.6e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.16 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.023 seconds (Warm-up)
## Chain 1: 0.02 seconds (Sampling)
## Chain 1: 0.043 seconds (Total)
## Chain 1:
## (Chains 2-4 produce similar progress output; omitted)
## Warning: There were 595 divergent transitions after warmup. See
## https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## to find out why this is a problem and how to eliminate them.
## Warning: Examine the pairs() plot to diagnose sampling problems
## Warning: The largest R-hat is 1.07, indicating chains have not mixed.
## Running the chains for more iterations may help. See
## https://mc-stan.org/misc/warnings.html#r-hat
## Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
## Running the chains for more iterations may help. See
## https://mc-stan.org/misc/warnings.html#bulk-ess
## Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
## Running the chains for more iterations may help. See
## https://mc-stan.org/misc/warnings.html#tail-ess
# then to make the trank plot use this code
mcmc_rank_overlay(model_fit) + theme_ap(family = "")
# round two
# get the stan code
model_stan2 <- stancode(p2)
## data{
## vector[2] y;
## }
## parameters{
## real a;
## real<lower=0> sigma;
## }
## model{
## real mu;
## sigma ~ exponential( 1 );
## a ~ normal( 0 , 10 );
## mu = a;
## y ~ normal( mu , sigma );
## }
# fit the model in stan
model_fit2 <- stan(model_code = model_stan2, data = list(y = y))
## (sampling progress output for Chains 1-4 omitted)
# then to make the trank plot use this code
mcmc_rank_overlay(model_fit2) + theme_ap(family = "")
In the first trace rank plot, we see that the chains do not stay within
the same range; some of the chains start up at 200, so there appears to
be no convergence.
In the second trace rank plot, the chains look much healthier. They all
stay within the same range, which is great!
9M1. Re-estimate the terrain ruggedness model from the
chapter, but now using a uniform prior for the standard deviation,
sigma. The uniform prior should be dunif(0,1). Use ulam to
estimate the posterior. Does the different prior have any detectable
influence on the posterior distribution of sigma? Why or why
not?
# first let's load the data and manipulate it as needed
library(tidybayes.rethinking)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6 ✔ purrr 0.3.4
## ✔ tibble 3.1.8 ✔ dplyr 1.0.10
## ✔ tidyr 1.2.0 ✔ stringr 1.4.0
## ✔ readr 2.1.2 ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ tidyr::extract() masks rstan::extract()
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ✖ purrr::map() masks rethinking::map()
library(tidybayes)
data(rugged)
d <- tibble(rugged)
d <- d %>%
drop_na(rgdppc_2000) %>%
mutate(log_gdp = log(rgdppc_2000)) %>%
mutate(log_gdp_std = log_gdp / mean(log_gdp)) %>%
mutate(rugged_std = rugged / max(rugged)) %>%
mutate(cid = ifelse(cont_africa == 1, 1, 2))
# next, we will select only the things we need for our model
d <- d %>%
mutate(cid = as.integer(cid)) %>%
select(log_gdp_std, rugged_std, cid)
# next, sample from the posterior with ulam, copying exact code
r <- ulam(
alist(
log_gdp_std ~ dnorm(mu, sigma),
mu <- a[cid] + b[cid] * (rugged_std - 0.215),
a[cid] ~ dnorm(1, 0.1),
b[cid] ~ dnorm(0, 0.3),
sigma ~ dexp(1)
),
data = d, chains = 4, cores = 4
)
# now we will run the same model but with the sigma changed
r2 <- ulam(
alist(
log_gdp_std ~ dnorm(mu, sigma),
mu <- a[cid] + b[cid] * (rugged_std - 0.215),
a[cid] ~ dnorm(1, 0.1),
b[cid] ~ dnorm(0, 0.3),
sigma ~ dunif(0, 1)
),
data = d, chains = 4, cores = 4
)
precis(r, depth = 2)
## mean sd 5.5% 94.5% n_eff Rhat4
## a[1] 0.8870435 0.015664803 0.86123826 0.91146260 2326.522 0.9995860
## a[2] 1.0505163 0.009858319 1.03499980 1.06648076 3061.597 0.9989312
## b[1] 0.1323087 0.075608251 0.01010434 0.25274629 2934.545 0.9997414
## b[2] -0.1429479 0.054415521 -0.22743265 -0.05563444 2287.316 0.9998117
## sigma 0.1118088 0.006171219 0.10265540 0.12202137 2157.591 0.9999643
traceplot(r)
# get the stan code
model_stan3 <- stancode(r)
## data{
## vector[170] log_gdp_std;
## vector[170] rugged_std;
## int cid[170];
## }
## parameters{
## vector[2] a;
## vector[2] b;
## real<lower=0> sigma;
## }
## model{
## vector[170] mu;
## sigma ~ exponential( 1 );
## b ~ normal( 0 , 0.3 );
## a ~ normal( 1 , 0.1 );
## for ( i in 1:170 ) {
## mu[i] = a[cid[i]] + b[cid[i]] * (rugged_std[i] - 0.215);
## }
## log_gdp_std ~ normal( mu , sigma );
## }
# fit the model in stan
model_fit3 <- stan(model_code = model_stan3, data = d)
## (sampling progress output for Chains 1-4 omitted)
# then to make the trank plot use this code
mcmc_rank_overlay(model_fit3) + theme_ap(family = "")
precis(r2, depth = 2)
## mean sd 5.5% 94.5% n_eff Rhat4
## a[1] 0.8863272 0.015887687 0.86030331 0.91140563 2621.682 0.9985145
## a[2] 1.0503687 0.010386786 1.03423514 1.06711383 2775.867 1.0007820
## b[1] 0.1307645 0.072662859 0.01621594 0.24353226 2732.227 1.0003857
## b[2] -0.1446808 0.056003071 -0.23327339 -0.05667087 2757.077 0.9985688
## sigma 0.1118433 0.006343597 0.10231364 0.12222374 2251.760 0.9998250
traceplot(r2)
# get the stan code
model_stan4 <- stancode(r2)
## data{
## vector[170] log_gdp_std;
## vector[170] rugged_std;
## int cid[170];
## }
## parameters{
## vector[2] a;
## vector[2] b;
## real<lower=0,upper=1> sigma;
## }
## model{
## vector[170] mu;
## sigma ~ uniform( 0 , 1 );
## b ~ normal( 0 , 0.3 );
## a ~ normal( 1 , 0.1 );
## for ( i in 1:170 ) {
## mu[i] = a[cid[i]] + b[cid[i]] * (rugged_std[i] - 0.215);
## }
## log_gdp_std ~ normal( mu , sigma );
## }
# fit the model in stan
model_fit4 <- stan(model_code = model_stan4, data = d)
## (sampling progress output for Chains 1-4 omitted)
# then to make the trank plot use this code
mcmc_rank_overlay(model_fit4) + theme_ap(family = "")
# let's run a prior simulation
# and look at the posterior of the two sigmas
# prior
x2 <- runif(1e4, 0, 1)
x <- rexp(1e4, 1)
dens(x2, main = "Priors", xlab = "", ylab = "")
dens(x, add = TRUE)
# posterior
xa <- tidy_draws(r)
x2a <- tidy_draws(r2)
dens(xa$sigma,
main = "Posterior", xlab = "", ylab = "",
ylim = c(0, 70)
)
dens(x2a$sigma, add = TRUE)
The different prior doesn’t seem to have a detectable influence on
the posterior distribution of sigma. Even though the shapes of the two
priors are quite different, the samples from the two posteriors have
nearly identical shapes. I believe this is due to the amount of data
present: with 170 observations, the likelihood dominates, and both
priors are weak enough that the data overwhelm them, so the posteriors
end up essentially the same.
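A hypothetical grid-approximation sketch (simulated data with the same sample size and scale, not the rugged data itself) shows why weak priors wash out with this much data:

```r
# Hypothetical sketch with simulated data (not the rugged data): grid
# approximation of the posterior for sigma under the two priors.
set.seed(9)
obs <- rnorm(170, mean = 0, sd = 0.11)   # same n and scale as the fitted sigma
sigma_grid <- seq(0.05, 0.3, length.out = 500)
loglik <- sapply(sigma_grid, function(s) sum(dnorm(obs, 0, s, log = TRUE)))
logpost_exp  <- loglik + dexp(sigma_grid, rate = 1, log = TRUE)
logpost_unif <- loglik + dunif(sigma_grid, 0, 1, log = TRUE)
post_exp  <- exp(logpost_exp - max(logpost_exp))
post_exp  <- post_exp / sum(post_exp)
post_unif <- exp(logpost_unif - max(logpost_unif))
post_unif <- post_unif / sum(post_unif)
max(abs(post_exp - post_unif))  # tiny: the two posteriors nearly coincide
```

With 170 observations, the log-likelihood is sharply peaked, while both log-priors are nearly flat over that peak, so the normalized posteriors are almost indistinguishable.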
9M2. Modify the terrain ruggedness model again. This time,
change the prior for b[cid] to dexp(0.3). What does this do
to the posterior distribution? Can you explain it?
# again, let's load the data
data(rugged)
d2 <- tibble(rugged)
d2 <- d2 %>%
mutate(log_gdp = log(rgdppc_2000)) %>%
drop_na(rgdppc_2000) %>%
mutate(log_gdp_std = log_gdp / mean(log_gdp)) %>%
mutate(rugged_std = rugged / max(rugged)) %>%
mutate(cid = ifelse(cont_africa == 1, 1, 2))
# next, we will select only the things we need for our model
d2 <- d2 %>%
select(log_gdp_std, rugged_std, cid)
# next, sample from the posterior with ulam, copying exact code
rb <- ulam(
alist(
log_gdp_std ~ dnorm(mu, sigma),
mu <- a[cid] + b[cid] * (rugged_std - 0.215),
a[cid] ~ dnorm(1, 0.1),
b[cid] ~ dnorm(0, 0.3),
sigma ~ dexp(1)
),
data = d2, chains = 4, cores = 4
)
# now we will run the same model but with the prior on b changed
rb2 <- ulam(
alist(
log_gdp_std ~ dnorm(mu, sigma),
mu <- a[cid] + b[cid] * (rugged_std - 0.215),
a[cid] ~ dnorm(1, 0.1),
b[cid] ~ dexp(0.3),
sigma ~ dexp(1)
),
data = d2, chains = 4, cores = 4
)
precis(rb, depth = 2)
## mean sd 5.5% 94.5% n_eff Rhat4
## a[1] 0.8861604 0.016202413 0.86053607 0.91222119 2791.952 0.9986630
## a[2] 1.0505696 0.010520927 1.03346184 1.06758328 3279.849 0.9997307
## b[1] 0.1323073 0.072604803 0.01481703 0.24492821 1852.260 1.0025353
## b[2] -0.1435128 0.056946382 -0.23436338 -0.05184164 2779.847 0.9992213
## sigma 0.1115004 0.006338309 0.10184982 0.12198239 2189.879 1.0011387
traceplot(rb)
precis(rb2, depth = 2)
## mean sd 5.5% 94.5% n_eff Rhat4
## a[1] 0.88736562 0.016268254 0.860921174 0.91319208 1590.022 0.9996236
## a[2] 1.04793272 0.010445478 1.031403789 1.06420018 1814.056 0.9996274
## b[1] 0.14819080 0.071466188 0.037748947 0.27150958 1345.454 1.0000394
## b[2] 0.01806656 0.017401129 0.001032106 0.05208636 1717.317 0.9992318
## sigma 0.11416145 0.006383088 0.104558868 0.12476776 1358.688 1.0057934
traceplot(rb2)
# let's run a prior simulation
# and look at the posterior of the two sigmas
# prior
set.seed(1103)
xb <- rnorm(1e4, 0, 0.3)
xb2 <- rexp(1e4, 0.3)
# priors
dens(xb2,
main = "Priors", xlab = "", ylab = "",
ylim = c(0, 1.4)
)
dens(xb, add = TRUE)
# posterior
xab <- tidy_draws(rb)
x2ab <- tidy_draws(rb2)
# Africa
dens(xab$`b[1]`,
main = "Posterior for continents in Africa",
xlab = "", ylab = "", ylim = c(0, 5.5)
)
dens(x2ab$`b[1]`, add = TRUE)
# not Africa
dens(xab$`b[2]`,
main = "Posterior for continents not in Africa",
xlab = "", ylab = "", ylim = c(0, 40)
)
dens(x2ab$`b[2]`, add = TRUE)
Once again the priors look very different, but this time there is a real
difference in one posterior: b[2]. The dexp(0.3) prior only
has support on positive values, so b[cid] can no longer be
negative. For African nations (b[1]) the posterior was
already positive, so the two models agree closely. For non-African
nations (b[2]) the posterior under the normal prior is
negative (mean about -0.14), but under the exponential prior it is
pushed up against zero (mean about 0.02), exactly the difference visible
in the precis output above.
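The positive-only support of the exponential prior can be checked directly with a quick sketch (plain R; the sample size is arbitrary):

```r
# A dnorm(0, 0.3) prior puts mass on both signs;
# a dexp(0.3) prior puts all of its mass on positive values.
set.seed(9)
draws_norm <- rnorm(1e5, 0, 0.3)
draws_exp <- rexp(1e5, 0.3)
mean(draws_norm < 0) # roughly 0.5
mean(draws_exp < 0) # exactly 0: a negative slope is impossible a priori
```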
9M3. Re-estimate one of the Stan models from the chapter, but
at different numbers of warm-up iterations. Be sure to use the same
number of sampling iterations in each case. Compare the
n_eff values. How much warm-up is enough?
# let's just go ahead and continue with what we did above in E6
y <- c(-2, 2)
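Rather than copy-pasting the same model eight times, the fits below could equivalently be collected in a loop; a sketch, assuming the rethinking package is loaded and y defined as above:

```r
# Fit the identical model at several warmup lengths and compare n_eff.
warmups <- c(1, 5, 25, 50, 100, 150, 200, 500)
fits <- lapply(warmups, function(w) {
  ulam(
    alist(
      y ~ dnorm(mu, sigma),
      mu <- a,
      a ~ dnorm(1, 10),
      sigma ~ dexp(1)
    ),
    data = list(y = y), chains = 1, warmup = w, iter = 1000
  )
})
# one precis() per warmup setting
lapply(fits, precis)
```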
# model 1 with 1 warm up sequence
set.seed(1101)
m1 <- ulam(
alist(
y ~ dnorm(mu, sigma),
mu <- a,
a ~ dnorm(1, 10),
sigma ~ dexp(1)
),
data = list(y = y), chains = 1, warmup = 1, iter = 1000
)
##
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 1.6e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.16 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: WARNING: No variance estimation is
## Chain 1: performed for num_warmup < 20
## Chain 1:
## Chain 1: Iteration: 1 / 1000 [ 0%] (Warmup)
## Chain 1: Iteration: 2 / 1000 [ 0%] (Sampling)
## Chain 1: Iteration: 101 / 1000 [ 10%] (Sampling)
## Chain 1: Iteration: 201 / 1000 [ 20%] (Sampling)
## Chain 1: Iteration: 301 / 1000 [ 30%] (Sampling)
## Chain 1: Iteration: 401 / 1000 [ 40%] (Sampling)
## Chain 1: Iteration: 501 / 1000 [ 50%] (Sampling)
## Chain 1: Iteration: 601 / 1000 [ 60%] (Sampling)
## Chain 1: Iteration: 701 / 1000 [ 70%] (Sampling)
## Chain 1: Iteration: 801 / 1000 [ 80%] (Sampling)
## Chain 1: Iteration: 901 / 1000 [ 90%] (Sampling)
## Chain 1: Iteration: 1000 / 1000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0 seconds (Warm-up)
## Chain 1: 0.005 seconds (Sampling)
## Chain 1: 0.005 seconds (Total)
## Chain 1:
## Warning: There were 999 divergent transitions after warmup. See
## https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## to find out why this is a problem and how to eliminate them.
## Warning: Examine the pairs() plot to diagnose sampling problems
# model 2 with 5 warm up sequence
set.seed(1102)
m2 <- ulam(
alist(
y ~ dnorm(mu, sigma),
mu <- a,
a ~ dnorm(1, 10),
sigma ~ dexp(1)
),
data = list(y = y), chains = 1, warmup = 5, iter = 1000
)
##
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 1.4e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.14 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: WARNING: No variance estimation is
## Chain 1: performed for num_warmup < 20
## Chain 1:
## Chain 1: Iteration: 1 / 1000 [ 0%] (Warmup)
## Chain 1: Iteration: 6 / 1000 [ 0%] (Sampling)
## Chain 1: Iteration: 105 / 1000 [ 10%] (Sampling)
## Chain 1: Iteration: 205 / 1000 [ 20%] (Sampling)
## Chain 1: Iteration: 305 / 1000 [ 30%] (Sampling)
## Chain 1: Iteration: 405 / 1000 [ 40%] (Sampling)
## Chain 1: Iteration: 505 / 1000 [ 50%] (Sampling)
## Chain 1: Iteration: 605 / 1000 [ 60%] (Sampling)
## Chain 1: Iteration: 705 / 1000 [ 70%] (Sampling)
## Chain 1: Iteration: 805 / 1000 [ 80%] (Sampling)
## Chain 1: Iteration: 905 / 1000 [ 90%] (Sampling)
## Chain 1: Iteration: 1000 / 1000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0 seconds (Warm-up)
## Chain 1: 0.043 seconds (Sampling)
## Chain 1: 0.043 seconds (Total)
## Chain 1:
# model 3 with 25 warm up sequence
set.seed(1103)
m3 <- ulam(
alist(
y ~ dnorm(mu, sigma),
mu <- a,
a ~ dnorm(1, 10),
sigma ~ dexp(1)
),
data = list(y = y), chains = 1, warmup = 25, iter = 1000
)
##
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 1.4e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.14 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: WARNING: There aren't enough warmup iterations to fit the
## Chain 1: three stages of adaptation as currently configured.
## Chain 1: Reducing each adaptation stage to 15%/75%/10% of
## Chain 1: the given number of warmup iterations:
## Chain 1: init_buffer = 3
## Chain 1: adapt_window = 20
## Chain 1: term_buffer = 2
## Chain 1:
## Chain 1: Iteration: 1 / 1000 [ 0%] (Warmup)
## Chain 1: Iteration: 26 / 1000 [ 2%] (Sampling)
## Chain 1: Iteration: 125 / 1000 [ 12%] (Sampling)
## Chain 1: Iteration: 225 / 1000 [ 22%] (Sampling)
## Chain 1: Iteration: 325 / 1000 [ 32%] (Sampling)
## Chain 1: Iteration: 425 / 1000 [ 42%] (Sampling)
## Chain 1: Iteration: 525 / 1000 [ 52%] (Sampling)
## Chain 1: Iteration: 625 / 1000 [ 62%] (Sampling)
## Chain 1: Iteration: 725 / 1000 [ 72%] (Sampling)
## Chain 1: Iteration: 825 / 1000 [ 82%] (Sampling)
## Chain 1: Iteration: 925 / 1000 [ 92%] (Sampling)
## Chain 1: Iteration: 1000 / 1000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0 seconds (Warm-up)
## Chain 1: 0.021 seconds (Sampling)
## Chain 1: 0.021 seconds (Total)
## Chain 1:
# model 4 with 50 warm up sequence
set.seed(1104)
m4 <- ulam(
alist(
y ~ dnorm(mu, sigma),
mu <- a,
a ~ dnorm(1, 10),
sigma ~ dexp(1)
),
data = list(y = y), chains = 1, warmup = 50, iter = 1000
)
##
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 1.5e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.15 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: WARNING: There aren't enough warmup iterations to fit the
## Chain 1: three stages of adaptation as currently configured.
## Chain 1: Reducing each adaptation stage to 15%/75%/10% of
## Chain 1: the given number of warmup iterations:
## Chain 1: init_buffer = 7
## Chain 1: adapt_window = 38
## Chain 1: term_buffer = 5
## Chain 1:
## Chain 1: Iteration: 1 / 1000 [ 0%] (Warmup)
## Chain 1: Iteration: 51 / 1000 [ 5%] (Sampling)
## Chain 1: Iteration: 150 / 1000 [ 15%] (Sampling)
## Chain 1: Iteration: 250 / 1000 [ 25%] (Sampling)
## Chain 1: Iteration: 350 / 1000 [ 35%] (Sampling)
## Chain 1: Iteration: 450 / 1000 [ 45%] (Sampling)
## Chain 1: Iteration: 550 / 1000 [ 55%] (Sampling)
## Chain 1: Iteration: 650 / 1000 [ 65%] (Sampling)
## Chain 1: Iteration: 750 / 1000 [ 75%] (Sampling)
## Chain 1: Iteration: 850 / 1000 [ 85%] (Sampling)
## Chain 1: Iteration: 950 / 1000 [ 95%] (Sampling)
## Chain 1: Iteration: 1000 / 1000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.001 seconds (Warm-up)
## Chain 1: 0.024 seconds (Sampling)
## Chain 1: 0.025 seconds (Total)
## Chain 1:
# model 5 with 100 warm up sequence
set.seed(1105)
m5 <- ulam(
alist(
y ~ dnorm(mu, sigma),
mu <- a,
a ~ dnorm(1, 10),
sigma ~ dexp(1)
),
data = list(y = y), chains = 1, warmup = 100, iter = 1000
)
##
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 1.5e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.15 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: WARNING: There aren't enough warmup iterations to fit the
## Chain 1: three stages of adaptation as currently configured.
## Chain 1: Reducing each adaptation stage to 15%/75%/10% of
## Chain 1: the given number of warmup iterations:
## Chain 1: init_buffer = 15
## Chain 1: adapt_window = 75
## Chain 1: term_buffer = 10
## Chain 1:
## Chain 1: Iteration: 1 / 1000 [ 0%] (Warmup)
## Chain 1: Iteration: 100 / 1000 [ 10%] (Warmup)
## Chain 1: Iteration: 101 / 1000 [ 10%] (Sampling)
## Chain 1: Iteration: 200 / 1000 [ 20%] (Sampling)
## Chain 1: Iteration: 300 / 1000 [ 30%] (Sampling)
## Chain 1: Iteration: 400 / 1000 [ 40%] (Sampling)
## Chain 1: Iteration: 500 / 1000 [ 50%] (Sampling)
## Chain 1: Iteration: 600 / 1000 [ 60%] (Sampling)
## Chain 1: Iteration: 700 / 1000 [ 70%] (Sampling)
## Chain 1: Iteration: 800 / 1000 [ 80%] (Sampling)
## Chain 1: Iteration: 900 / 1000 [ 90%] (Sampling)
## Chain 1: Iteration: 1000 / 1000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.002 seconds (Warm-up)
## Chain 1: 0.02 seconds (Sampling)
## Chain 1: 0.022 seconds (Total)
## Chain 1:
# model 6 with 150 warm up sequence
set.seed(1106)
m6 <- ulam(
alist(
y ~ dnorm(mu, sigma),
mu <- a,
a ~ dnorm(1, 10),
sigma ~ dexp(1)
),
data = list(y = y), chains = 1, warmup = 150, iter = 1000
)
##
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 1.4e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.14 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 1000 [ 0%] (Warmup)
## Chain 1: Iteration: 100 / 1000 [ 10%] (Warmup)
## Chain 1: Iteration: 151 / 1000 [ 15%] (Sampling)
## Chain 1: Iteration: 250 / 1000 [ 25%] (Sampling)
## Chain 1: Iteration: 350 / 1000 [ 35%] (Sampling)
## Chain 1: Iteration: 450 / 1000 [ 45%] (Sampling)
## Chain 1: Iteration: 550 / 1000 [ 55%] (Sampling)
## Chain 1: Iteration: 650 / 1000 [ 65%] (Sampling)
## Chain 1: Iteration: 750 / 1000 [ 75%] (Sampling)
## Chain 1: Iteration: 850 / 1000 [ 85%] (Sampling)
## Chain 1: Iteration: 950 / 1000 [ 95%] (Sampling)
## Chain 1: Iteration: 1000 / 1000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.003 seconds (Warm-up)
## Chain 1: 0.01 seconds (Sampling)
## Chain 1: 0.013 seconds (Total)
## Chain 1:
# model 7 with 200 warm up sequence
set.seed(1107)
m7 <- ulam(
alist(
y ~ dnorm(mu, sigma),
mu <- a,
a ~ dnorm(1, 10),
sigma ~ dexp(1)
),
data = list(y = y), chains = 1, warmup = 200, iter = 1000
)
##
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 2.4e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.24 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 1000 [ 0%] (Warmup)
## Chain 1: Iteration: 100 / 1000 [ 10%] (Warmup)
## Chain 1: Iteration: 200 / 1000 [ 20%] (Warmup)
## Chain 1: Iteration: 201 / 1000 [ 20%] (Sampling)
## Chain 1: Iteration: 300 / 1000 [ 30%] (Sampling)
## Chain 1: Iteration: 400 / 1000 [ 40%] (Sampling)
## Chain 1: Iteration: 500 / 1000 [ 50%] (Sampling)
## Chain 1: Iteration: 600 / 1000 [ 60%] (Sampling)
## Chain 1: Iteration: 700 / 1000 [ 70%] (Sampling)
## Chain 1: Iteration: 800 / 1000 [ 80%] (Sampling)
## Chain 1: Iteration: 900 / 1000 [ 90%] (Sampling)
## Chain 1: Iteration: 1000 / 1000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.004 seconds (Warm-up)
## Chain 1: 0.014 seconds (Sampling)
## Chain 1: 0.018 seconds (Total)
## Chain 1:
# model 8 with 500 warm up sequence
set.seed(1108)
m8 <- ulam(
alist(
y ~ dnorm(mu, sigma),
mu <- a,
a ~ dnorm(1, 10),
sigma ~ dexp(1)
),
data = list(y = y), chains = 1, warmup = 500, iter = 1000
)
##
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 1.4e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.14 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 1000 [ 0%] (Warmup)
## Chain 1: Iteration: 100 / 1000 [ 10%] (Warmup)
## Chain 1: Iteration: 200 / 1000 [ 20%] (Warmup)
## Chain 1: Iteration: 300 / 1000 [ 30%] (Warmup)
## Chain 1: Iteration: 400 / 1000 [ 40%] (Warmup)
## Chain 1: Iteration: 500 / 1000 [ 50%] (Warmup)
## Chain 1: Iteration: 501 / 1000 [ 50%] (Sampling)
## Chain 1: Iteration: 600 / 1000 [ 60%] (Sampling)
## Chain 1: Iteration: 700 / 1000 [ 70%] (Sampling)
## Chain 1: Iteration: 800 / 1000 [ 80%] (Sampling)
## Chain 1: Iteration: 900 / 1000 [ 90%] (Sampling)
## Chain 1: Iteration: 1000 / 1000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.009 seconds (Warm-up)
## Chain 1: 0.005 seconds (Sampling)
## Chain 1: 0.014 seconds (Total)
## Chain 1:
Notice how the first two models (m1 and m2) warn that no variance
estimation is performed for num_warmup < 20, and m1 additionally reports
999 divergent transitions after warmup. Models m3, m4, and m5 instead
show this warning:
WARNING: There aren't enough warmup iterations to fit the
three stages of adaptation as currently configured.
Reducing each adaptation stage to 15%/75%/10% of
the given number of warmup iterations:
With m6 (150 warmup iterations), no warning appears at all. This leads
me to believe (without yet checking the precis()) that an
acceptable warmup length for this model is somewhere between 100 and
150 iterations.
Notice, once we use 500 warmup iterations (m8), half of the run is spent
warming up and only 500 samples remain; that run also produced a warning
that the Bulk Effective Sample Size (ESS) is too low, indicating
posterior means and medians may be unreliable.
Let’s look at the precis() for each model to find out
more.
precis(m1)
## mean sd 5.5% 94.5% n_eff Rhat4
## a 0.406346 0 0.406346 0.406346 0.5012539 0.9989975
## sigma 1.461002 0 1.461002 1.461002 0.5012539 0.9989975
precis(m2)
## mean sd 5.5% 94.5% n_eff Rhat4
## a 0.04573657 1.4486169 -2.380043 2.324431 221.4205 0.9992402
## sigma 2.20240574 0.8495442 1.197153 3.613936 510.3069 1.0006596
precis(m3)
## mean sd 5.5% 94.5% n_eff Rhat4
## a 0.2362235 1.4906129 -1.923891 2.897837 158.0945 1.0183978
## sigma 2.2219017 0.8753619 1.189871 3.919690 370.9363 0.9997707
precis(m4)
## mean sd 5.5% 94.5% n_eff Rhat4
## a 0.2186743 1.5926411 -2.069571 2.833554 236.115 1.007410
## sigma 2.2440790 0.9383574 1.178105 4.008383 393.516 1.003975
precis(m5)
## mean sd 5.5% 94.5% n_eff Rhat4
## a -0.04448307 1.5615858 -2.450422 2.532589 255.7666 0.999606
## sigma 2.27044775 0.9220325 1.186984 4.037821 383.2022 1.003070
precis(m6)
## mean sd 5.5% 94.5% n_eff Rhat4
## a 0.09142607 1.7707749 -2.669964 3.197862 250.5142 1.0031714
## sigma 2.28122910 0.9639211 1.207088 4.155355 340.0160 0.9990998
precis(m7)
## mean sd 5.5% 94.5% n_eff Rhat4
## a 0.1426471 1.6925422 -2.370414 2.710979 353.4644 0.9992345
## sigma 2.2538279 0.9513723 1.144860 4.083786 226.6008 0.9997694
precis(m8)
## mean sd 5.5% 94.5% n_eff Rhat4
## a -0.1983693 1.5457108 -2.469906 2.216117 139.7338 1.010758
## sigma 2.2317385 0.8899875 1.168764 3.818955 225.6213 1.004156
Looking at the n_eff values, m1 is clearly broken: its chain
got stuck (note the sd of 0 and an n_eff of about 0.5). From
m2 onward the n_eff values are in the low hundreds, and once
the warmup warnings stop (m6 and on) they remain stable and do not
change much from model to model. For a model this simple, somewhere
between 100 and 150 warmup iterations appears to be enough.
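The intuition for why an autocorrelated chain has n_eff smaller than the actual number of samples can be sketched with a toy AR(1) series (this is only a crude lag-1 approximation, not Stan's actual ESS estimator, which uses many lags):

```r
# For an AR(1) chain with correlation rho, n_eff is roughly
# N * (1 - rho) / (1 + rho).
set.seed(1)
N <- 1000
rho <- 0.9
x <- numeric(N)
for (i in 2:N) x[i] <- rho * x[i - 1] + rnorm(1)
r1 <- cor(x[-1], x[-N]) # estimated lag-1 autocorrelation
N * (1 - r1) / (1 + r1) # far fewer effective samples than N
```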